As we all know, this virus is spreading dangerously through communities and localities. The Kaggle community is working hard to show how the spread is trending in different countries of the world. The COVID trend in India is well known: the virus is spreading rapidly across the Indian subcontinent. The measures taken by the Indian government are good and essential, but we should still prepare for a long fight against this pandemic until scientists around the world find a breakthrough. Until then, STAY INDOORS and follow the guidelines of the authorities. STAY SAFE

Please UPVOTE this kernel if you like it. It motivates me to produce more quality content :)

Please comment with your views and what you understand from this.

Don't forget to give your suggestions in the comment section.

This notebook is my contribution to showing the medical situation in India, along with estimates and trends.
Credit - the whole Kaggle community, with special thanks to

  1. Logistic Curve Fitting - Global Covid-19 Confirmed ~ by Daner Ferhadi for curve fitting

It might take some time to load

In [1]:
!pip install chart_studio
!pip install pgeocode
Collecting chart_studio
  Downloading chart_studio-1.1.0-py3-none-any.whl (64 kB)
     |████████████████████████████████| 64 kB 960 kB/s eta 0:00:011
Requirement already satisfied: requests in /opt/conda/lib/python3.6/site-packages (from chart_studio) (2.22.0)
Requirement already satisfied: plotly in /opt/conda/lib/python3.6/site-packages (from chart_studio) (4.5.4)
Requirement already satisfied: retrying>=1.3.3 in /opt/conda/lib/python3.6/site-packages (from chart_studio) (1.3.3)
Requirement already satisfied: six in /opt/conda/lib/python3.6/site-packages (from chart_studio) (1.14.0)
Requirement already satisfied: urllib3!=1.25.0,!=1.25.1,<1.26,>=1.21.1 in /opt/conda/lib/python3.6/site-packages (from requests->chart_studio) (1.24.3)
Requirement already satisfied: idna<2.9,>=2.5 in /opt/conda/lib/python3.6/site-packages (from requests->chart_studio) (2.8)
Requirement already satisfied: chardet<3.1.0,>=3.0.2 in /opt/conda/lib/python3.6/site-packages (from requests->chart_studio) (3.0.4)
Requirement already satisfied: certifi>=2017.4.17 in /opt/conda/lib/python3.6/site-packages (from requests->chart_studio) (2019.11.28)
Installing collected packages: chart-studio
Successfully installed chart-studio-1.1.0
Collecting pgeocode
  Downloading pgeocode-0.2.1-py2.py3-none-any.whl (7.6 kB)
Requirement already satisfied: pandas in /opt/conda/lib/python3.6/site-packages (from pgeocode) (0.25.3)
Requirement already satisfied: numpy in /opt/conda/lib/python3.6/site-packages (from pgeocode) (1.18.2)
Requirement already satisfied: requests in /opt/conda/lib/python3.6/site-packages (from pgeocode) (2.22.0)
Requirement already satisfied: pytz>=2017.2 in /opt/conda/lib/python3.6/site-packages (from pandas->pgeocode) (2019.3)
Requirement already satisfied: python-dateutil>=2.6.1 in /opt/conda/lib/python3.6/site-packages (from pandas->pgeocode) (2.8.1)
Requirement already satisfied: urllib3!=1.25.0,!=1.25.1,<1.26,>=1.21.1 in /opt/conda/lib/python3.6/site-packages (from requests->pgeocode) (1.24.3)
Requirement already satisfied: chardet<3.1.0,>=3.0.2 in /opt/conda/lib/python3.6/site-packages (from requests->pgeocode) (3.0.4)
Requirement already satisfied: idna<2.9,>=2.5 in /opt/conda/lib/python3.6/site-packages (from requests->pgeocode) (2.8)
Requirement already satisfied: certifi>=2017.4.17 in /opt/conda/lib/python3.6/site-packages (from requests->pgeocode) (2019.11.28)
Requirement already satisfied: six>=1.5 in /opt/conda/lib/python3.6/site-packages (from python-dateutil>=2.6.1->pandas->pgeocode) (1.14.0)
Installing collected packages: pgeocode
Successfully installed pgeocode-0.2.1
In [2]:
import numpy as np
import pandas as pd

import plotly.graph_objects as go
import plotly.express as px
import plotly.io as pio
pio.templates.default = "plotly_dark"
from plotly.subplots import make_subplots

from pathlib import Path
data_dir = Path('../input/covid19-corona-virus-india-dataset')

import os
os.listdir(data_dir)
Out[2]:
['tests_daily.csv',
 'complete.csv',
 'zones.csv',
 'tests_latest_state_level.csv',
 'district_level_latest.csv',
 'web_scraping.ipynb',
 'patients_data.csv',
 'state_level_latest.csv',
 'nation_level_daily.csv',
 'api.ipynb']
In [3]:
df = pd.read_csv('../input/covid19-corona-virus-india-dataset/patients_data.csv')
/opt/conda/lib/python3.6/site-packages/IPython/core/interactiveshell.py:3063: DtypeWarning:

Columns (4,10,11,13,20) have mixed types. Specify dtype option on import or set low_memory=False.

In [ ]:
from IPython.display import display, HTML
display(HTML(df.tail().to_html()))

This shows information about the patients. The dataset is updated every 24 hours; as of today there are 11512 cases of COVID in India.

A little info - I started making this notebook yesterday, when there were about 9500 patients, so you can see how severely the number is increasing.

In [4]:
def state_wise_patients(name,df=df):
    data = df.loc[df['detected_state']==name]
    df = data[['patient_number','date_announced','detected_state']]
    data = df.groupby('date_announced')['patient_number'].nunique()
    data = data.reset_index()
    data['date_announced']=pd.to_datetime(data['date_announced'],format = '%d/%m/%Y')
    data = data.sort_values(by=['date_announced'], ascending=True)
    data['patient_number'] = data.patient_number.cumsum()
    return data

This function filters the data for one state and returns the cumulative number of patients by announcement date
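
The steps in this function can be sketched on a small made-up table. The rows below are hypothetical and only illustrate the filter, group, sort and cumulative-sum sequence; the helper name `cumulative_patients` and the choice to parse dates before grouping are assumptions made for this sketch, not the notebook's own code.

```python
import pandas as pd

# Hypothetical patient records; the notebook reads the real ones from patients_data.csv.
toy = pd.DataFrame({
    'patient_number': [1, 2, 3, 4, 5, 6],
    'date_announced': ['09/03/2020', '09/03/2020', '10/03/2020',
                       '10/03/2020', '10/03/2020', '11/03/2020'],
    'detected_state': ['Maharashtra', 'Maharashtra', 'Maharashtra',
                       'Kerala', 'Maharashtra', 'Maharashtra'],
})

def cumulative_patients(frame, state):
    """Return the running total of distinct patients per day for one state."""
    subset = frame.loc[frame['detected_state'] == state].copy()
    # Parse day/month/year strings so that sorting is chronological.
    subset['date_announced'] = pd.to_datetime(subset['date_announced'],
                                              format='%d/%m/%Y')
    daily = (subset.groupby('date_announced')['patient_number']
                   .nunique()
                   .sort_index()
                   .reset_index())
    # Turn daily counts into a running total.
    daily['patient_number'] = daily['patient_number'].cumsum()
    return daily

result = cumulative_patients(toy, 'Maharashtra')
print(result)
```

Passing the frame in explicitly, rather than relying on a default argument, keeps the helper from silently using whatever `df` happened to exist when it was defined.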

In [5]:
collection = {}
for i in df.detected_state.unique():
    collection['patients in '+ str(i)] = state_wise_patients(i)

For instance, see collection['patients in Maharashtra']:

In [6]:
collection['patients in Maharashtra']
Out[6]:
date_announced patient_number
16 2020-03-09 2
19 2020-03-10 5
21 2020-03-11 11
23 2020-03-12 14
25 2020-03-13 17
... ... ...
9 2020-05-05 8353
11 2020-05-06 8392
13 2020-05-07 8430
15 2020-05-08 8463
18 2020-05-09 8490

62 rows × 2 columns

In [7]:
keys = list(collection.keys())
In [8]:
visible_True=[]
for i in range(len(keys)):
    visible_True.append(True)
def t2f(i):
    visible = []
    for a in range(len(keys)):
        if a == i:
            visible.append(True)
        else:
            visible.append(False)
    return visible
In [9]:
def create_buttons(keys=keys):
    l=[dict(label = 'All',
                  method = 'update',
                  args = [{'visible': visible_True},
                          {'title': 'All',
                           'showlegend':True}])]
    for i in range(len(keys)):
        l.append(dict(label = keys[i],
                  method = 'update',
                  args = [{'visible': t2f(i)}, # the index of True aligns with the indices of plot traces
                          {'title': keys[i],
                           'showlegend':True}]))
    return l
In [10]:
fig = go.Figure()
keys = list(collection.keys())
for column in collection:
    fig.add_trace(
        go.Scatter(  # go.Line is deprecated; go.Scatter is the supported trace type for lines
            x = collection[column].date_announced,
            y = collection[column].patient_number,
            name = column
        )
    )
    
#fig.update_layout(updatemenus=[go.layout.Updatemenu( active=0,buttons=list(create_buttons()))])

fig.show()

We will analyze these states

  1. Maharashtra
  2. Uttar Pradesh
  3. Rajasthan
  4. Tamil Nadu
  5. Delhi

    because these states together account for 60% of India's COVID-positive cases

1. Findings - Number of Patients

You can see the plots for these states either by clicking the legend or by using the dropdown list

  1. In Maharashtra the situation started getting bad in the period 29 March - 3 April, and the near-exponential growth after that suggests there might have been an outbreak or a system failure
  2. Though the situation in Uttar Pradesh was not as harsh as in Maharashtra, it got out of hand in the period 29 March - 5 April, when cluster spreading started
  3. In Rajasthan the same period, 29 March - 5 April, is when cases started to grow in an exponential manner
  4. In Tamil Nadu, again in the period 29 March - 5 April, we see sudden growth from 67 cases on 30 March to 571 on 5 April, roughly a 9x increase
  5. Delhi is different: the outbreak happened in the period 1 - 5 April
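
The dropdown works by giving every button a list of booleans, one per trace, that says which traces stay visible. The sketch below mirrors the idea behind `t2f` and `create_buttons` using plain dicts only; the helper names `one_hot_mask` and `build_buttons` and the three trace names are hypothetical.

```python
def one_hot_mask(index, size):
    """Return a list of booleans that is True only at position `index`."""
    return [position == index for position in range(size)]

def build_buttons(names):
    """Build button dicts in the shape passed to the figure's updatemenus."""
    # The first button shows every trace.
    buttons = [dict(label='All', method='update',
                    args=[{'visible': [True] * len(names)},
                          {'title': 'All', 'showlegend': True}])]
    # One further button per trace; the mask order must match the trace order.
    for i, name in enumerate(names):
        buttons.append(dict(label=name, method='update',
                            args=[{'visible': one_hot_mask(i, len(names))},
                                  {'title': name, 'showlegend': True}]))
    return buttons

# Hypothetical trace names, in the order the traces would be added to the figure.
names = ['patients in Maharashtra', 'patients in Delhi', 'patients in Kerala']
buttons = build_buttons(names)
print(buttons[2]['args'][0]['visible'])
```

Because the mask is positional, the buttons only select the right trace if the names are listed in the same order in which the traces were added.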
In [11]:
def per_day_inc(collection=collection):
    for i in list(collection.keys()):
        collection[i]['prev_patients'] = collection[i]['patient_number'].shift(1)
        collection[i]['new_patients'] = collection[i]['patient_number'] - collection[i]['prev_patients']
    return collection
coll1 = per_day_inc()
In [12]:
fig = go.Figure()
keys = list(collection.keys())
for column in collection:
    fig.add_trace(
        go.Scatter(  # replaces the deprecated go.Line
            x = collection[column].date_announced,
            y = collection[column].new_patients,
            name = column
        )
    )
    
fig.update_layout(updatemenus=[go.layout.Updatemenu(active=0,buttons=list(create_buttons()))])

fig.show()

2. Findings - Daily increase in Patients

You can see the plots for these states either by clicking the legend or by using the dropdown list

  1. In Maharashtra the daily increase in the number of patients stayed within 117 until 8 April, but sudden increases are seen on 9 April and 13 April
  2. Uttar Pradesh saw a spike on 4 April, and after 11 April we see peaks on 13 April (75) and 14 April (102)
  3. In Rajasthan we see a spike on 5 April and another on 9 April, and after that day a sudden growth of 317 patients
  4. Tamil Nadu saw an unprecedented increase on 1 April, with around 100 patients testing positive in a single day, which explains why the 29 March - 5 April period in Tamil Nadu was severe
  5. Though Delhi has seen ups and downs in the daily rate of positive cases, 13 April saw 356 patients in a day
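
The daily increase is the first difference of the cumulative series. As a small check, the sketch below takes the first five cumulative Maharashtra values from the output shown earlier and compares the shift-and-subtract approach of `per_day_inc` with `Series.diff()`, which is an alternative and not what the notebook uses.

```python
import pandas as pd

# First five cumulative values from collection['patients in Maharashtra'].
cumulative = pd.Series([2, 5, 11, 14, 17], name='patient_number')

# Shift-and-subtract, as done in per_day_inc.
by_shift = cumulative - cumulative.shift(1)

# Series.diff() computes the same first difference in one call.
by_diff = cumulative.diff()

print(by_diff.tolist())
```

The first entry is NaN in both versions because the first day has no previous value to subtract.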
In [15]:
fig = go.Figure()
keys = list(collection.keys())
for column in collection:
    fig.add_trace(
        go.Scatter(  # replaces the deprecated go.Line
            x = collection[column].date_announced,
            y = collection[column].patient_number,
            name = column
        )
    )
    
fig.update_layout(yaxis_type='log')

fig.show()

3. Findings - Log Scale

  1. Maharashtra: on a log scale we see linear growth in the number of patients from 15 March
  2. Uttar Pradesh: the same linear growth is observed from 8 March
  3. Rajasthan: the growth pattern matches that of Uttar Pradesh, but the rate is higher
  4. Tamil Nadu: 18 March is the day the patient count starts to grow in an exponential manner
  5. Delhi follows the same trend, indicating that COVID cases increase exponentially
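
A straight line on a log-scaled axis is what exponential growth looks like, and the slope of that line gives the doubling time. The sketch below uses a synthetic series (10 cases doubling every 4 days, assumed values rather than real data) to show how the doubling time could be recovered from a linear fit.

```python
import numpy as np

# Synthetic series: 10 cases doubling every 4 days (assumed, not real data).
days = np.arange(0, 21)
cases = 10 * 2 ** (days / 4)

# Fit a straight line to log2(cases); the slope is the number of doublings per day.
slope, intercept = np.polyfit(days, np.log2(cases), 1)
doubling_time = 1 / slope

print(round(doubling_time, 2))
```

On real counts the fit would only be meaningful over the stretch where the log-scale curve is actually straight.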
In [16]:
import requests
india_data_json = requests.get('https://api.rootnet.in/covid19-in/unofficial/covid19india.org/statewise').json()
df_india = pd.io.json.json_normalize(india_data_json['data']['statewise'])
df_india = df_india.set_index("state")
In [17]:
total = df_india.sum()
total.name = "Total"
df_t = pd.DataFrame(total,dtype=float).transpose()
df_t["Mortality Rate (per 100)"] = np.round(100*df_t["deaths"]/df_t["confirmed"],2)
In [18]:
df_india["Mortality Rate (per 100)"]= np.round(np.nan_to_num(100*df_india["deaths"]/df_india["confirmed"]),2)
df_india.style.background_gradient(cmap='Blues',subset=["confirmed"])\
                        .background_gradient(cmap='Reds',subset=["deaths"])\
                        .background_gradient(cmap='Greens',subset=["recovered"])\
                        .background_gradient(cmap='Purples',subset=["active"])\
                        .background_gradient(cmap='plasma',subset=["Mortality Rate (per 100)"])
Out[18]:
confirmed recovered deaths active Mortality Rate (per 100)
state
Maharashtra 33053 7688 1198 24167 3.62
Gujarat 11380 4499 659 6222 5.79
Tamil Nadu 11760 4172 82 7506 0.7
Delhi 10054 4485 160 5409 1.59
Rajasthan 5375 3068 133 2174 2.47
Madhya Pradesh 4977 2403 248 2326 4.98
Uttar Pradesh 4464 2636 112 1716 2.51
West Bengal 2677 959 238 1480 8.89
Andhra Pradesh 2432 1552 50 830 2.06
Punjab 1964 1366 35 563 1.78
Telangana 1551 992 34 525 2.19
Bihar 1392 473 9 910 0.65
Jammu and Kashmir 1183 575 13 595 1.1
Karnataka 1246 530 37 678 2.97
Haryana 912 563 14 335 1.54
Odisha 876 277 4 595 0.46
Kerala 631 497 4 130 0.63
Jharkhand 225 113 3 109 1.33
Chandigarh 196 54 3 139 1.53
Tripura 165 85 0 80 0
Assam 105 42 3 58 2.86
Uttarakhand 93 52 1 40 1.08
Himachal Pradesh 85 41 3 38 3.53
Chhattisgarh 85 59 0 26 0
Ladakh 43 24 0 19 0
Andaman and Nicobar Islands 33 33 0 0 0
Goa 31 7 0 24 0
Puducherry 17 9 0 8 0
Meghalaya 13 12 1 0 7.69
Manipur 7 2 0 5 0
Mizoram 1 1 0 0 0
Arunachal Pradesh 1 1 0 0 0
Dadra and Nagar Haveli and Daman and Diu 1 1 0 0 0
Nagaland 0 0 0 0 0
Lakshadweep 0 0 0 0 0
Sikkim 0 0 0 0 0

The above table is self-explanatory: it shows the most essential stats about the COVID situation (confirmed, recovered, deaths, active, mortality rate per 100), where a darker color means a higher value
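
As a worked check of the mortality rate column, the sketch below recomputes it for three rows copied from the table above. Nagaland is included because 0 deaths over 0 confirmed cases gives NaN, which `np.nan_to_num` turns into 0.

```python
import numpy as np
import pandas as pd

# Three rows copied from the table above.
sample = pd.DataFrame(
    {'confirmed': [33053, 11380, 0], 'deaths': [1198, 659, 0]},
    index=['Maharashtra', 'Gujarat', 'Nagaland'],
)

# 0 / 0 yields NaN in pandas; nan_to_num replaces it with 0.
rate = 100 * sample['deaths'] / sample['confirmed']
sample['Mortality Rate (per 100)'] = np.round(np.nan_to_num(rate), 2)

print(sample)
```

Note that a rate of 0 for a state with no confirmed cases means "not defined", not "no one died".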

In [19]:
trace1 = go.Bar(
                x = df_india.index,
                y = df_india.deaths,
                name = "deaths",
                marker = dict(color = 'rgba(255, 174, 255, 0.5)',
                             line=dict(color='rgb(0,0,0)',width=1.5)),
                text = df_india.index)
# create trace2 
trace2 = go.Bar(
                x = df_india.index,
                y = df_india.recovered,
                name = "recovered",
                marker = dict(color = 'rgba(255, 255, 128, 0.5)',
                              line=dict(color='rgb(0,0,0)',width=1.5)),
                text = df_india.index)
trace3 = go.Bar(
                x = df_india.index,
                y = df_india.active,
                name = "active",
                marker = dict(color = 'rgba(0, 174, 174, 0.5)',
                             line=dict(color='rgb(0,0,0)',width=1.5)),
                text = df_india.index)
data = [trace1, trace2,trace3]
layout = go.Layout(barmode = "group")
fig = go.Figure(data = data, layout = layout)
fig.update_layout(yaxis_type="log")
fig.show()

Simple bar chart (log scale) showing the huge difference between active and recovered cases

In [20]:
import plotly.express as px
import numpy as np
fig = px.scatter_3d(df_india, x='recovered', y='active', z='confirmed',size='confirmed',  color=df_india.index)
fig.update_layout(height=800, width=800,scene_zaxis_type="log",scene_yaxis_type="log",scene_xaxis_type="log")
fig.show()

3D scatter plot showing a clear pattern: all of the data points are aligned in a linear fashion. What is the reason? (Tell me in the comments)

In [21]:
import plotly.graph_objects as go
from plotly.subplots import make_subplots

labels = df_india.index

# Create subplots: use 'domain' type for Pie subplot
fig = make_subplots(rows=1, cols=2, specs=[[{'type':'domain'}, {'type':'domain'}]])
fig.add_trace(go.Pie(labels=labels, values=df_india.deaths, name="deaths"),
              1, 1)
fig.add_trace(go.Pie(labels=labels, values=df_india["Mortality Rate (per 100)"], name="Mortality Rate (per 100)"),
              1, 2)

# Use `hole` to create a donut-like pie chart
fig.update_traces(hole=.4, hoverinfo="label+percent+name")

fig.update_layout(
    title_text="Covid-19 ",
    # Add annotations in the center of the donut pies.
    annotations=[dict(text='deaths', x=0.18, y=0.5, font_size=20, showarrow=False),
                 dict(text='Mortality', x=0.82, y=0.5, font_size=20, showarrow=False)])
fig.show()

4. Findings - Deaths and Mortality

  1. Maharashtra accounts for around 38.7% of India's deaths and Gujarat for 22%
  2. The mortality rate in West Bengal, Maharashtra, Punjab and Madhya Pradesh is significantly higher than in other states (the mortality rate is per 100 patients); Meghalaya is ignored because of the little data available
In [22]:
data_dir = Path('/kaggle/input/hospitalbedloc')
In [23]:
import folium
india = folium.Map(location=[20.5937,78.9629], zoom_start=5.4)
df = pd.read_csv('../input/covid19-corona-virus-india-dataset/complete.csv')
for lat, lon,State,Death,Total_confirmed_cases in zip(df['Latitude'], df['Longitude'],df['Name of State / UT'],df['Death'],df['Total Confirmed cases']):
    folium.CircleMarker([lat, lon],
                        radius=5,
                        color='Blue',
                      # build one string; a stray comma previously turned this into a tuple
                      popup =('State:' + str(State) + '<br>' +
                             'Total Confirmed cases:' + str(Total_confirmed_cases) + '<br>' +
                              'Deaths :' + str(Death) +'<br>'
                             ),
                        fill_color='red',
                        fill_opacity=0.7 ).add_to(india)
india
Out[23]:

Geospatial view of patients in different states of India (click a marker to view)

Now we will analyze the Hospitals and beds available in different states

In [24]:
df = pd.read_csv(data_dir/'HospitalBedsIndiaLocations.csv')
df = df.fillna(0)
df['total_Beds'] = df['NumUrbanBeds_NHP18'] + df['NumRuralBeds_NHP18'] + df['NumPublicBeds_HMIS'] 
df['total_Hospitals'] = df['NumUrbanHospitals_NHP18'] + df['NumRuralHospitals_NHP18'] + df['NumSubDistrictHospitals_HMIS'] + df['NumDistrictHospitals_HMIS']
In [25]:
#india = folium.Map(location=[20.5937,78.9629], zoom_start=5.4,max_zoom=10)
for i in range(len(df['State/UT'].values)):
    folium.Circle(
        location=[df.loc[i]['Latitude'], df.iloc[i]['Longitude']],
        tooltip = "<h5 style='text-align:center;font-weight: bold'>"+ str(df.iloc[i]['State/UT'])+"</h5>"+
        "<div style='text-align:center;'>"+"total_Beds:   " + str(df.iloc[i]['total_Beds'])+"</div>"+
        "<div style='text-align:center;'>"+"total_Hospital:   " + str(df.iloc[i]['total_Hospitals'])+"</div>"+
        "<hr style='margin:10px;'>"+
        "<ul style='color: #444;list-style-type:circle;align-item:left;padding-left:20px;padding-right:20px'>"+
        "<li>NumSubDistrictHospitals_HMIS: "+str(df.iloc[i]['NumSubDistrictHospitals_HMIS'])+"</li>"+
        "<li>NumDistrictHospitals_HMIS:   "+str(df.iloc[i]['NumDistrictHospitals_HMIS'])+"</li>"+
        "<li>TotalPublicHealthFacilities_HMIS:   "+str(df.iloc[i]['TotalPublicHealthFacilities_HMIS'])+"</li>"+       
        "<li>NumPublicBeds_HMIS:   "+str(df.iloc[i]['NumPublicBeds_HMIS'])+"</li>"+
        "<li>NumRuralHospitals_NHP18:   "+str(df.iloc[i]['NumRuralHospitals_NHP18'])+"</li>"+
        "<li>NumRuralBeds_NHP18:   "+str(df.iloc[i]['NumRuralBeds_NHP18'])+"</li>"+
        "<li>NumUrbanHospitals_NHP18:   " + str(df.iloc[i]['NumUrbanHospitals_NHP18'])+"</li>"+
        "<li>NumUrbanBeds_NHP18:   " + str(df.iloc[i]['NumUrbanBeds_NHP18'])+"</li>"+
        "<li>NumCommunityHealthCenters_HMIS:   " + str(df.iloc[i]['NumCommunityHealthCenters_HMIS'])+"</li>"+
        "</ul>",
        radius=int((np.log2(df.iloc[i]['total_Beds']+1))*6000),
        color=df['State/UT'].values[i],
        fill_color='red',
        fill=True).add_to(india)
india
Out[25]:

Geospatial view of the medical care facilities available in Indian states (hover to view)

In [26]:
data_dir1 = Path('/kaggle/input/covid19-in-india')
In [27]:
df1 = pd.read_csv(data_dir1/'AgeGroupDetails.csv')
In [28]:
import plotly.graph_objects as go
from plotly.subplots import make_subplots

labels = df1.AgeGroup
values = df1.TotalCases
# Create subplots: use 'domain' type for Pie subplot
fig = make_subplots(rows=1, cols=2, specs=[[{'type':'xy'}, {'type':'domain'}]])
fig.add_trace(go.Bar(x=labels, y=values, name="bar",marker = dict(color = 'rgba(0, 174, 174, 0.5)',
                             line=dict(color='rgb(0,0,0)',width=1.5)),
                text = labels),
              1, 1)
fig.add_trace(go.Pie(labels=labels, values=values, name="pie"),
              1, 2)


fig.update_layout(
    title_text="Covid-19 Age group details ")
    # Add annotations in the center of the donut pies.
fig.show()

5. Findings - Age Group

  1. As you can see, most patients are in the 20-29 and 30-39 age groups
  2. The 20-29 group comprises 24.9% of infected people, while the 30-39 group accounts for 21.1%
    This shows that people in these 2 groups are more infected, but not much can be said because little age group data is available

Using the pgeocode library to get latitude and longitude from the pincode

Here I have tried to show the distribution of ICMR labs across the states

In [29]:
import pgeocode
df1 =  pd.read_csv('../input/covid19-in-india/ICMRTestingLabs.csv')
nomi = pgeocode.Nominatim('IN')
lat = []
long = []
for i in df1.pincode.values:
    result = nomi.query_postal_code(str(i))  # look up each pincode once
    lat.append(result.latitude)
    long.append(result.longitude)
In [30]:
df1['lat'] = pd.Series(lat)
df1['long'] = pd.Series(long)
df1 = df1.dropna()
df1 = df1.reset_index()
In [31]:
import folium
#india = folium.Map(location=[20.5937,78.9629], zoom_start=5.4,max_zoom=10)
for i in range(len(df1)):
    folium.Circle(
        location=[df1.loc[i]['lat'], df1.iloc[i]['long']],
        tooltip = "<h5 style='text-align:center;font-weight: bold'>"+ str(df1.iloc[i]['state'])+"</h5>"+
        "<div style='text-align:center;'>"+"type:   " + str(df1.iloc[i]['type'])+"</div>"+
        "<div style='text-align:center;'>"+"city:   " + str(df1.iloc[i]['city'])+"</div>"+
        "<hr style='margin:10px;'>"+
        "<ul style='color: #444;list-style-type:circle;align-item:left;padding-left:20px;padding-right:20px'>"+
        "<li>lab  : " + str(df1.iloc[i]['lab'])+"</li>"+
        "<li>address : " + str(df1.iloc[i]['address'])+"</li>"+
        "</ul>",
        radius=30000,
        color=df1['state'].values[i],
        fill_color='red',
        fill=True).add_to(india)
india
Out[31]:

This map shows where the labs are situated in different regions of India and which state has more labs (hover to see)

Now I describe how the trends in testing and positive cases connect with each other

Data is up to 27 April

In [32]:
df1 = pd.read_csv('../input/covid19-in-india/ICMRTestingDetails.csv')
In [33]:
import plotly.graph_objects as go
x1 = df1.DateTime.values
x=df1.TotalSamplesTested.values
y = df1.TotalIndividualsTested.values
z = df1.TotalPositiveCases

fig = go.Figure()
fig.add_trace(go.Scatter(
    x=x1, y=z,
    hoverinfo='x+y',
    mode='lines',
    line=dict(width=0.5, color='rgb(256, 0, 256)'),
    stackgroup='one',
    name = 'TotalPositiveCases'
))
fig.add_trace(go.Scatter(
    x=x1, y=x,
    hoverinfo='x+y',
    mode='lines',
    line=dict(width=0.5, color='rgb(0, 256, 256)'),
    stackgroup='one',
    name ='TotalSamplesTested'
))

fig.update_layout()
fig.show()

fig = go.Figure()
fig.add_trace(go.Scatter(
    x=x1, y=z,
    hoverinfo='x+y',
    mode='lines',
    line=dict(width=0.5, color='rgb(256, 0, 256)'),
    stackgroup='one',
    name = 'TotalPositiveCases'
))
fig.add_trace(go.Scatter(
    x=x1, y=x,
    hoverinfo='x+y',
    mode='lines',
    line=dict(width=0.5, color='rgb(0, 256, 256)'),
    stackgroup='one',
    name ='TotalSamplesTested'
))

fig.update_layout(yaxis_type="log")
fig.show()

6. Findings - Samples Tested

  1. As you can see in the 1st plot, the total number of samples tested has increased exponentially since 29 March
  2. Though the testing rate is increasing, the number of tests is still very low compared with even 10% of India's population
  3. In the second plot you can see how the increase in total samples leads to a similar increase in total positive cases, leading us to infer that more and more testing should be done
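
One simple way to relate the two curves is the share of tested samples that came back positive. This metric is not computed in the notebook; the sketch below is an added illustration with hypothetical totals, reusing only the column names from ICMRTestingDetails.csv, and the `positivity_pct` column name is made up here.

```python
import pandas as pd

# Hypothetical cumulative totals; only the column names come from the dataset.
tests = pd.DataFrame({
    'TotalSamplesTested': [1000, 2000, 4000],
    'TotalPositiveCases': [40, 90, 160],
})

# Share of tested samples that were positive, in percent.
tests['positivity_pct'] = (100 * tests['TotalPositiveCases']
                           / tests['TotalSamplesTested'])

print(tests)
```

If this share stays roughly flat while testing grows, the positive count rises in step with the number of samples, which is the pattern described for the second plot.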
In [34]:
df = pd.read_csv(data_dir/'HospitalBedsIndiaLocations.csv')
df =df.fillna(0)
df['total_Beds'] = df['NumUrbanBeds_NHP18'] + df['NumRuralBeds_NHP18'] + df['NumPublicBeds_HMIS'] 
df['total_Hospitals'] = df['NumUrbanHospitals_NHP18'] + df['NumRuralHospitals_NHP18'] + df['NumSubDistrictHospitals_HMIS'] + df['NumDistrictHospitals_HMIS']
df.index=df['State/UT']
df = df.drop(columns=['Sno','State/UT'])

Aggregating the Beds and Hospitals in different states

In [35]:
df1 = pd.read_csv('../input/covid19-in-india/population_india_census2011.csv')
df1 = df1.sort_values(by='State / Union Territory')
df1 = df1.reset_index()
df1.index = df1['State / Union Territory']
df1 = df1.drop(columns=['index','Sno','State / Union Territory'])
from IPython.display import display, HTML
display(HTML(df1.to_html()))
Population Rural population Urban population Area Density Gender Ratio
State / Union Territory
Andaman and Nicobar Islands 380581 237093 143488 8,249 km2 (3,185 sq mi) 46/km2 (120/sq mi) 876
Andhra Pradesh 49577103 34966693 14610410 162,968 km2 (62,922 sq mi) 303/km2 (780/sq mi) 993
Arunachal Pradesh 1383727 1066358 317369 83,743 km2 (32,333 sq mi) 17/km2 (44/sq mi) 938
Assam 31205576 26807034 4398542 78,438 km2 (30,285 sq mi) 397/km2 (1,030/sq mi) 954
Bihar 104099452 92341436 11758016 94,163 km2 (36,357 sq mi) 1,102/km2 (2,850/sq mi) 918
Chandigarh 1055450 28991 1026459 114 km2 (44 sq mi) 9,252/km2 (23,960/sq mi) 818
Chhattisgarh 25545198 19607961 5937237 135,191 km2 (52,198 sq mi) 189/km2 (490/sq mi) 991
Dadra and Nagar Haveli and Daman and Diu 585764 243510 342254 603 km2 (233 sq mi) 970/km2 (2,500/sq mi) 711
Delhi 16787941 419042 16368899 1,484 km2 (573 sq mi) 11,297/km2 (29,260/sq mi) 868
Goa 1458545 551731 906814 3,702 km2 (1,429 sq mi) 394/km2 (1,020/sq mi) 973
Gujarat 60439692 34694609 25745083 196,024 km2 (75,685 sq mi) 308/km2 (800/sq mi) 919
Haryana 25351462 16509359 8842103 44,212 km2 (17,070 sq mi) 573/km2 (1,480/sq mi) 879
Himachal Pradesh 6864602 6176050 688552 55,673 km2 (21,495 sq mi) 123/km2 (320/sq mi) 972
Jammu and Kashmir 12267032 9064220 3202812 125,535 km2 (48,469 sq mi) 98/km2 (250/sq mi) 890
Jharkhand 32988134 25055073 7933061 79,714 km2 (30,778 sq mi) 414/km2 (1,070/sq mi) 948
Karnataka 61095297 37469335 23625962 191,791 km2 (74,051 sq mi) 319/km2 (830/sq mi) 973
Kerala 33406061 17471135 15934926 38,863 km2 (15,005 sq mi) 859/km2 (2,220/sq mi) 1084
Ladakh 274000 43840 230160 96,701 km2 (37,336 sq mi) 2.8/km2 (7.3/sq mi) 853
Lakshadweep 64473 14141 50332 32 km2 (12 sq mi) 2,013/km2 (5,210/sq mi) 946
Madhya Pradesh 72626809 52557404 20069405 308,245 km2 (119,014 sq mi) 236/km2 (610/sq mi) 931
Maharashtra 112374333 61556074 50818259 307,713 km2 (118,809 sq mi) 365/km2 (950/sq mi) 929
Manipur 2570390 1793875 776515 22,327 km2 (8,621 sq mi) 122/km2 (320/sq mi) 992
Meghalaya 2966889 2371439 595450 22,429 km2 (8,660 sq mi) 132/km2 (340/sq mi) 989
Mizoram 1097206 525435 571771 21,081 km2 (8,139 sq mi) 52/km2 (130/sq mi) 976
Nagaland 1978502 1407536 570966 16,579 km2 (6,401 sq mi) 119/km2 (310/sq mi) 931
Odisha 41974218 34970562 7003656 155,707 km2 (60,119 sq mi) 269/km2 (700/sq mi) 979
Puducherry 1247953 395200 852753 479 km2 (185 sq mi) 2,598/km2 (6,730/sq mi) 1037
Punjab 27743338 17344192 10399146 50,362 km2 (19,445 sq mi) 550/km2 (1,400/sq mi) 895
Rajasthan 68548437 51500352 17048085 342,239 km2 (132,139 sq mi) 201/km2 (520/sq mi) 928
Sikkim 610577 456999 153578 7,096 km2 (2,740 sq mi) 86/km2 (220/sq mi) 890
Tamil Nadu 72147030 37229590 34917440 130,058 km2 (50,216 sq mi) 555/km2 (1,440/sq mi) 996
Telengana 35003674 21395009 13608665 112,077 km2 (43,273 sq mi) 312/km2 (810/sq mi) 988
Tripura 3673917 2712464 961453 10,486 km2 (4,049 sq mi) 350/km2 (910/sq mi) 960
Uttar Pradesh 199812341 155317278 44495063 240,928 km2 (93,023 sq mi) 828/km2 (2,140/sq mi) 912
Uttarakhand 10086292 7036954 3049338 53,483 km2 (20,650 sq mi) 189/km2 (490/sq mi) 963
West Bengal 91276115 62183113 29093002 88,752 km2 (34,267 sq mi) 1,029/km2 (2,670/sq mi) 953
In [36]:
df.index.values[0] = df1.index.values[0]
df.index.values[-5] = df1.index.values[-5]
df.index.values[14] = df1.index.values[13]
In [37]:
df['Population'] = df1['Population']
df['Rural population'] = df1['Rural population']
df['Urban population'] = df1['Urban population']
df = df.drop(['Dadra & Nagar Haveli','Daman & Diu'],axis=0)
In [38]:
df['total_Rural_Hospitals'] = df['NumRuralHospitals_NHP18'] + df['NumSubDistrictHospitals_HMIS'] + df['NumDistrictHospitals_HMIS'] 
df['total_Rural_Beds'] = df['NumRuralBeds_NHP18'] + df['NumPublicBeds_HMIS'] 
df['total_Urban_Hospitals'] =  df['NumUrbanHospitals_NHP18'] + df['NumSubDistrictHospitals_HMIS'] + df['NumDistrictHospitals_HMIS']
df['total_Urban_Beds'] = df['NumUrbanBeds_NHP18'] + df['NumPublicBeds_HMIS'] 
df['total_medical_centres'] = df['NumPrimaryHealthCenters_HMIS'] + df['NumCommunityHealthCenters_HMIS'] + df['TotalPublicHealthFacilities_HMIS']

New columns

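  1. total_Rural_Hospitals - total rural hospitals in the state
  2. total_Rural_Beds - total rural beds in the state
  3. total_Urban_Hospitals - total urban hospitals in the state
  4. total_Urban_Beds - total urban beds in the state
  5. total_medical_centres - sum of primary health centres, community health centres and total public health facilities in the state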
In [39]:
df["Hospitals (per 100000)"]= np.round(100000*df["total_Hospitals"]/df["Population"],2)
df["Beds (per 100000)"]= np.round(100000*df["total_Beds"]/df["Population"],2)
df["rural Hospitals (per 100000)"]= np.round(100000*df["total_Rural_Hospitals"]/df["Rural population"],2)
df["rural Beds (per 100000)"]= np.round(100000*df["total_Rural_Beds"]/df["Rural population"],2)
df["Urban Hospitals (per 100000)"]= np.round(100000*df["total_Urban_Hospitals"]/df["Urban population"],2)
df["Urban Beds (per 100000)"]= np.round(100000*df["total_Urban_Beds"]/df["Urban population"],2)
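
As a worked check of the per-100000 scaling, the sketch below recomputes the urban beds figure for two states. The bed totals and 2011 urban populations are copied from the tables in this notebook; the small frame itself is only for illustration.

```python
import numpy as np
import pandas as pd

# Urban bed totals and 2011 urban populations copied from this notebook's tables.
beds = pd.DataFrame(
    {'total_Urban_Beds': [77457, 23732],
     'Urban population': [14610410, 11758016]},
    index=['Andhra Pradesh', 'Bihar'],
)

# Scale the ratio to beds per 100000 urban residents and round to 2 decimals.
beds['Urban Beds (per 100000)'] = np.round(
    100000 * beds['total_Urban_Beds'] / beds['Urban population'], 2)

print(beds)
```

The results should match the Urban Beds (per 100000) values the notebook reports for these two states.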
In [40]:
df = df[['total_Rural_Beds','total_Urban_Hospitals','total_Urban_Beds','total_medical_centres','Hospitals (per 100000)','Beds (per 100000)'
    ,'rural Hospitals (per 100000)','rural Beds (per 100000)','Urban Hospitals (per 100000)','Urban Beds (per 100000)']]
In [42]:
df_india = df_india.sort_index()
In [43]:
df['confirmed'] = df_india['confirmed']
df['recovered'] = df_india['recovered']
df['deaths'] = df_india['deaths']
df['active'] = df_india['active']
df['Mortality Rate (per 100)'] = df_india['Mortality Rate (per 100)']
In [44]:
df=df.dropna()
In [45]:
df.style.background_gradient(cmap='Blues',subset=["Beds (per 100000)"])\
                        .background_gradient(cmap='Reds',subset=["Urban Hospitals (per 100000)"])\
                        .background_gradient(cmap='Greens',subset=["rural Hospitals (per 100000)"])\
                        .background_gradient(cmap='Purples',subset=["rural Beds (per 100000)"])\
                        .background_gradient(cmap='YlOrBr',subset=["Urban Beds (per 100000)"])\
                        .background_gradient(cmap='Oranges',subset=["Hospitals (per 100000)"])\
                        .background_gradient(cmap='Reds',subset=["Mortality Rate (per 100)"])\
                        .background_gradient(cmap='Purples',subset=["confirmed"])\
                        .background_gradient(cmap='Greens',subset=["deaths"])\
                        .background_gradient(cmap='Blues',subset=["active"])\
                        .background_gradient(cmap='Oranges',subset=["recovered"])
Out[45]:
State/UT | total_Rural_Beds | total_Urban_Hospitals | total_Urban_Beds | total_medical_centres | Hospitals (per 100000) | Beds (per 100000) | rural Hospitals (per 100000) | rural Beds (per 100000) | Urban Hospitals (per 100000) | Urban Beds (per 100000) | confirmed | recovered | deaths | active | Mortality Rate (per 100)
Andaman and Nicobar Islands | 1821 | 6 | 1746 | 65 | 8.67 | 609.86 | 12.65 | 768.05 | 4.18 | 1216.83 | 33 | 33 | 0 | 0 | 0
Andhra Pradesh | 67279 | 116 | 77457 | 3281 | 0.62 | 169.31 | 0.7 | 192.41 | 0.79 | 530.15 | 2432 | 1552 | 50 | 830 | 2.06
Arunachal Pradesh | 4456 | 25 | 2588 | 383 | 16.84 | 341.4 | 20.91 | 417.87 | 7.88 | 815.45 | 1 | 1 | 0 | 0 | 0
Assam | 30059 | 97 | 25313 | 2393 | 4.08 | 116.19 | 4.56 | 112.13 | 2.21 | 575.49 | 105 | 42 | 3 | 58 | 2.86
Bihar | 23879 | 179 | 23732 | 4216 | 1.07 | 28.64 | 1.09 | 25.86 | 1.52 | 201.84 | 1392 | 473 | 9 | 910 | 0.65
Chandigarh | 3756 | 9 | 4534 | 89 | 0.85 | 429.58 | 17.25 | 12955.7 | 0.88 | 441.71 | 196 | 54 | 3 | 139 | 1.53
Chhattisgarh | 19424 | 89 | 18696 | 2002 | 1.01 | 93.04 | 1.09 | 99.06 | 1.5 | 314.89 | 85 | 59 | 0 | 26 | 0
Delhi | 20572 | 165 | 44955 | 1174 | 0.98 | 267.78 | 13.36 | 4909.29 | 1.01 | 274.64 | 10054 | 4485 | 160 | 5409 | 1.59
Goa | 4071 | 30 | 4274 | 75 | 3.22 | 389.36 | 3.99 | 737.86 | 3.31 | 471.32 | 31 | 7 | 0 | 24 | 0
Gujarat | 52844 | 203 | 61694 | 4391 | 0.94 | 121.46 | 1.28 | 152.31 | 0.79 | 239.63 | 11380 | 4499 | 659 | 6222 | 5.79
Haryana | 20531 | 111 | 18391 | 1314 | 2.84 | 98.93 | 4 | 124.36 | 1.26 | 207.99 | 912 | 563 | 14 | 335 | 1.54
Himachal Pradesh | 14371 | 172 | 15440 | 1266 | 12.78 | 307.45 | 12.65 | 232.69 | 24.98 | 2242.39 | 85 | 41 | 3 | 38 | 3.53
Jammu and Kashmir | 18576 | 105 | 15759 | 1607 | 1.31 | 187.44 | 0.94 | 204.94 | 3.28 | 492.04 | 1183 | 575 | 13 | 595 | 1.1
Jharkhand | 13246 | 72 | 12346 | 1080 | 1.79 | 55.13 | 2.22 | 52.87 | 0.91 | 155.63 | 225 | 113 | 3 | 109 | 1.33
Karnataka | 77405 | 563 | 105426 | 5697 | 4.97 | 207.05 | 7.1 | 206.58 | 2.38 | 446.23 | 1246 | 530 | 37 | 678 | 2.97
Kerala | 56376 | 434 | 60650 | 2459 | 4.24 | 232.04 | 6.39 | 322.68 | 2.72 | 380.61 | 631 | 497 | 4 | 130 | 0.63
Lakshadweep | 550 | 3 | 250 | 17 | 18.61 | 853.07 | 84.86 | 3889.4 | 5.96 | 496.7 | 0 | 0 | 0 | 0 | 0
Madhya Pradesh | 48160 | 240 | 56959 | 3611 | 0.79 | 92.22 | 0.87 | 91.63 | 1.2 | 283.81 | 4977 | 2403 | 248 | 2326 | 4.98
Maharashtra | 81396 | 609 | 108046 | 6307 | 0.78 | 107.18 | 0.72 | 132.23 | 1.2 | 212.61 | 33053 | 7688 | 1198 | 24167 | 3.62
Manipur | 3292 | 17 | 3259 | 218 | 1.56 | 155.19 | 1.84 | 183.51 | 2.19 | 419.7 | 7 | 2 | 0 | 5 | 0
Meghalaya | 6555 | 27 | 7072 | 347 | 5.73 | 304.76 | 6.58 | 276.41 | 4.53 | 1187.67 | 13 | 12 | 1 | 0 | 7.69
Mizoram | 2916 | 46 | 3705 | 162 | 9.3 | 392.72 | 12.94 | 554.97 | 8.05 | 647.99 | 1 | 1 | 0 | 0 | 0
Nagaland | 2574 | 26 | 3194 | 321 | 2.38 | 193.28 | 2.27 | 182.87 | 4.55 | 559.4 | 0 | 0 | 0 | 0 | 0
Odisha | 22836 | 211 | 28677 | 3536 | 4.45 | 83.42 | 4.91 | 65.3 | 3.01 | 409.46 | 876 | 277 | 4 | 595 | 0.46
Puducherry | 4558 | 20 | 7935 | 97 | 1.84 | 643.53 | 3.04 | 1153.34 | 2.35 | 930.52 | 17 | 9 | 0 | 8 | 0
Punjab | 19332 | 247 | 25655 | 1409 | 2.73 | 113.4 | 3.37 | 111.46 | 2.38 | 246.7 | 1964 | 1366 | 35 | 563 | 1.78
Rajasthan | 72932 | 247 | 62604 | 6181 | 1.24 | 122.09 | 1.36 | 141.61 | 1.45 | 367.22 | 5375 | 3068 | 133 | 2174 | 2.47
Sikkim | 1405 | 14 | 2445 | 59 | 6.22 | 443.02 | 6.35 | 307.44 | 9.12 | 1592.02 | 0 | 0 | 0 | 0 | 0
Tamil Nadu | 112795 | 867 | 109969 | 4820 | 2.16 | 208.11 | 2.78 | 302.97 | 2.48 | 314.94 | 11760 | 4172 | 82 | 7506 | 0.7
Tripura | 6035 | 77 | 8172 | 293 | 4.79 | 253.46 | 4.42 | 222.49 | 8.01 | 849.96 | 165 | 85 | 0 | 80 | 0
Uttar Pradesh | 97414 | 367 | 95466 | 8070 | 2.41 | 67.35 | 2.97 | 62.72 | 0.82 | 214.55 | 4464 | 2636 | 112 | 1716 | 2.51
Uttarakhand | 9944 | 89 | 11888 | 727 | 4.95 | 150.42 | 6.38 | 141.31 | 2.92 | 389.86 | 93 | 52 | 1 | 40 | 1.08
West Bengal | 70847 | 419 | 110045 | 3685 | 1.85 | 142.13 | 2.25 | 113.93 | 1.44 | 378.25 | 2677 | 959 | 238 | 1480 | 8.89

7. Findings Hospitals and Beds

  1. States such as Andaman and Nicobar Islands, Sikkim, Tripura and Himachal Pradesh have very high Urban Beds (per 100000) figures compared with other states, but keep in mind that these states have small populations compared with Uttar Pradesh, Maharashtra, etc.
  2. Chandigarh has around 13000 rural beds per 100000, which is very good compared with other regions. Delhi also has around 5000 rural beds per 100000, which might be one reason Delhi has such a low mortality rate.
  3. Rajasthan, Tamil Nadu and Uttar Pradesh have few beds per person but still have low mortality rates, which suggests that the governments of these states may have done good work on social distancing.
  4. Madhya Pradesh and Maharashtra have high mortality rates, and they also have few beds and hospitals per 100000 in all aspects.
    More results can be inferred, so brainstorm on it.
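
The Mortality Rate (per 100) column used in these findings appears to be deaths divided by confirmed cases, multiplied by 100. That formula is an inference from the table rows, not something the notebook states. A minimal sketch that checks it against a few rows copied from the table (the helper name `mortality_rate` is made up for illustration):

```python
# Rows copied from the table above:
# state -> (confirmed, deaths, reported Mortality Rate (per 100))
rows = {
    "Andhra Pradesh": (2432, 50, 2.06),
    "Gujarat": (11380, 659, 5.79),
    "Maharashtra": (33053, 1198, 3.62),
    "West Bengal": (2677, 238, 8.89),
}


def mortality_rate(confirmed, deaths):
    """Deaths per 100 confirmed cases (assumed formula; 0 when there are no cases)."""
    if confirmed == 0:
        return 0.0
    return round(deaths / confirmed * 100, 2)


computed = {state: mortality_rate(c, d) for state, (c, d, _) in rows.items()}
for state, (_, _, reported) in rows.items():
    # Print the recomputed value next to the value reported in the table
    print(state, computed[state], reported)
```

Because the rate depends only on reported cases and deaths, states with very few confirmed cases (Meghalaya has 13) can show a high rate from a single death, so such rows should be read with care.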
In [46]:
import plotly.graph_objects as go

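# Compute the correlation matrix once and label the axes from it,
# so the labels always match the columns that corr() actually kept
corr = df.corr()
fig = go.Figure(data=go.Heatmap(
                   z=corr.values,
                   x=corr.columns,
                   y=corr.index,
                   hoverongaps=False))
fig.show()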

8. Findings Correlation

  1. Hospitals (per 100000) has a correlation of -0.35 with confirmed cases and -0.27 with deaths.
  2. Urban Beds (per 100000) has a correlation of -0.33 with active cases.
  3. Beds (per 100000) has a negative correlation with Mortality Rate.
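
For readers who want to see what a single cell of the heatmap means: pandas' `df.corr()` computes the Pearson correlation by default. The sketch below computes it by hand for six rows copied from the table (Bihar, Gujarat, Kerala, Maharashtra, Delhi, West Bengal). It uses only a subset of states, so it illustrates the calculation and is not expected to reproduce the heatmap values; the function name `pearson` is made up for this example.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations, and the two sums of squared deviations
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Beds (per 100000) and Mortality Rate (per 100) for six states from the table
beds_per_100k = [28.64, 121.46, 232.04, 107.18, 267.78, 142.13]
mortality = [0.65, 5.79, 0.63, 3.62, 1.59, 8.89]

r = pearson(beds_per_100k, mortality)
print(round(r, 2))
```

A value near -1 or +1 indicates a strong linear relationship and a value near 0 a weak one. Correlation alone does not show that more beds cause fewer cases or deaths.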
In [47]:
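# low_memory=False avoids the mixed-dtype warning when pandas infers column types
df = pd.read_csv('../input/covid19-corona-virus-india-dataset/patients_data.csv', low_memory=False)
# value_counts() already excludes missing districts, so no dropna() is needed
a = df.detected_district.value_counts()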

But why is Mumbai the worst hit?
Maharashtra is bearing the brunt of India’s COVID-19 crisis, with 23% of the total cases and 46% of overall deaths*. Most cases are from Mumbai, with the highest share in its G-south ward. Wards with dense populations have the highest number of cases. The State has the highest number of such wards. for more you can see

In [48]:
!pip install opencage
In [49]:
import os
from opencage.geocoder import OpenCageGeocode

# Read the API key from the environment instead of hard-coding it in the notebook.
# OPENCAGE_API_KEY is an assumed variable name; get an API key from https://opencagedata.com
key = os.environ['OPENCAGE_API_KEY']
geocoder = OpenCageGeocode(key)
query = 'Bid, India'
results = geocoder.geocode(query)
lat = results[0]['geometry']['lat']
lng = results[0]['geometry']['lng']
print(lat, lng)
18.9918442 75.909784
In [50]:
from scipy.optimize import curve_fit
import matplotlib.pyplot as pt
In [51]:
def log_curve(x, k, x_0, ymax):
    # Logistic curve: k = growth rate, x_0 = inflection day, ymax = plateau
    return ymax / (1 + np.exp(-k * (np.asarray(x) - x_0)))

def fit_curve():
    # Fit only the states whose patient count exceeds 650
    for state, frame in collection.items():
        if frame['patient_number'].max() > 650:
            y_data = frame['patient_number'].values
            x_data = np.arange(len(y_data))
            popt, pcov = curve_fit(log_curve, x_data, y_data, bounds=([0, 0, 0], np.inf), maxfev=50000)
            k, x_0, ymax = popt
            x_pred = np.arange(160)  # extrapolate the fitted curve to 160 days
            y_fitted = log_curve(x_pred, k, x_0, ymax)
            fig = go.Figure()
            # go.Scatter with mode='lines' replaces the deprecated go.Line
            fig.add_trace(go.Scatter(x=x_pred, y=y_fitted, mode='lines', name='predicted ' + state))
            # Plot the observed data against its own length rather than a fixed 80 days
            fig.add_trace(go.Scatter(x=x_data, y=y_data, mode='lines', name=state))
            fig.show()

fit_curve()

9. Findings curve fit

  1. As of today, Delhi has 1561 patients, and the patient count might saturate in around 23 more days.
  2. For Tamil Nadu, the curve shows saturation much earlier, in just the next 8 days.
  3. In Rajasthan, it will take around 24 more days to reach the peak, which might lead to around 3000 positive patients.
  4. In Uttar Pradesh, the peak will be reached in 23 more days at around 1000 positive patients, which suggests that Uttar Pradesh might have made some progress in flattening the curve.
  5. The situation in Madhya Pradesh is similar to UP: they might have taken some measures to flatten the curve, and the peak will be reached in 20 more days.
  6. I am quite curious about the situation in Maharashtra: the peak will be reached in 26 more days, but the rate at which the patient count increases will be very high, and by the peak we will see around 8500 patients.
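
The notebook does not say how "saturate" or "peak" was read off the fitted curves. One way to make the idea concrete, assuming saturation means reaching 95% of the fitted plateau `ymax`, is to invert the logistic function: x = x_0 + ln(f / (1 - f)) / k for a fraction f. The parameter values and the `today` index below are hypothetical, chosen only to illustrate the arithmetic; they are not the fit results from this notebook.

```python
import math


def log_curve(x, k, x_0, ymax):
    """Same logistic form as the notebook's log_curve, for a scalar x."""
    return ymax / (1 + math.exp(-k * (x - x_0)))


def day_reaching_fraction(k, x_0, fraction=0.95):
    """Day index at which the logistic curve reaches `fraction` of ymax."""
    return x_0 + math.log(fraction / (1 - fraction)) / k


# Hypothetical fitted parameters, for illustration only
k, x_0, ymax = 0.15, 60.0, 3000.0
today = 55  # hypothetical index of the last observed day

saturation_day = day_reaching_fraction(k, x_0)
days_left = saturation_day - today
print(round(days_left, 1))
```

Note that `x_0` itself is the inflection point of the curve, which is the day on which the number of new patients per day is highest. The plateau `ymax` is only approached asymptotically, which is why a threshold such as 95% is needed.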

New Updates

  1. Zone Division
    The Union Health Ministry issued a revised list of districts under red, orange and green zones on Friday, effective from May 4. A total of 130 districts across the country have been placed under the red zone, while 284 and 319 districts have been identified as orange and green zones, respectively. This revised classification is based on the incidence of cases, doubling rate, the extent of testing and surveillance feedback. As per the revised criteria, Union Health Secretary Preeti Sudan said in a letter to state chief secretaries that green zones are districts which haven’t reported a fresh case in 21 days, down from 28 days earlier. The orange zones are those with a few cases, and the red ones have a large number of cases. For more you can see

My Thoughts

  1. I think that extending the lockdown is a good thing. As we have seen in the output of the cells above, it is important to flatten the curve, and the best way to do that is social distancing.
  2. We have seen above that there is a negative correlation between beds and confirmed cases, so I think more medical facilities might help.
  3. Testing should also be increased, as testing and positive cases have a linear correlation.
  4. The government should increase testing facilities for better testing.
    Please provide your thoughts too in the comment section.

Well, amid all of this negativity we can see some remarkable changes in our environment and air quality,
but the severe fatalities COVID has inflicted make me sad irrespective of the air quality during the lockdown period.
I also think that the government has spent a lot of money on reducing air and water pollution, but we never saw this level of change,
so in my view the government should learn from this lockdown and try to revise its environmental protection policies.